Overview

Dataset statistics

Number of variables: 28
Number of observations: 119390
Missing cells: 129425
Missing cells (%): 3.9%
Duplicate rows: 8237
Duplicate rows (%): 6.9%
Total size in memory: 24.7 MiB
Average record size in memory: 217.0 B
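
The percentages in this table follow directly from the raw counts. The sketch below recomputes them from the figures reported above; the variable names are illustrative and not part of the report. The average record size is only approximate here because the total size is already rounded to 24.7 MiB.

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 28
n_observations = 119_390
missing_cells = 129_425
duplicate_rows = 8_237
total_size_mib = 24.7  # already rounded in the report

# Missing cells are counted over every cell (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are counted over rows only.
duplicate_pct = 100 * duplicate_rows / n_observations

# Approximate bytes per row, derived from the rounded total size.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```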

Variable types

Categorical: 12
Numeric: 14
Boolean: 1
Text: 1

Alerts

Dataset has 8237 (6.9%) duplicate rows [Duplicates]
id_travel_agency_booking is highly overall correlated with type [High correlation]
type is highly overall correlated with id_travel_agency_booking [High correlation]
num_children is highly imbalanced (80.7%) [Imbalance]
num_babies is highly imbalanced (97.2%) [Imbalance]
distribution_channel is highly imbalanced (63.2%) [Imbalance]
repeated_guest is highly imbalanced (79.6%) [Imbalance]
reserved_room is highly imbalanced (58.3%) [Imbalance]
deposit_policy is highly imbalanced (65.3%) [Imbalance]
customer_type is highly imbalanced (50.6%) [Imbalance]
required_car_parking_spaces is highly imbalanced (85.4%) [Imbalance]
id_travel_agency_booking has 16340 (13.7%) missing values [Missing]
id_person_booking has 112593 (94.3%) missing values [Missing]
num_previous_cancellations is highly skewed (γ1 = 24.45804872) [Skewed]
num_previous_stays is highly skewed (γ1 = 23.53979995) [Skewed]
days_between_booking_arrival has 6345 (5.3%) zeros [Zeros]
num_weekend_nights has 51998 (43.6%) zeros [Zeros]
num_workweek_nights has 7645 (6.4%) zeros [Zeros]
market_segment has 12606 (10.6%) zeros [Zeros]
num_previous_cancellations has 112906 (94.6%) zeros [Zeros]
num_previous_stays has 115770 (97.0%) zeros [Zeros]
changes_between_booking_arrival has 101314 (84.9%) zeros [Zeros]
avg_price has 1960 (1.6%) zeros [Zeros]
total_of_special_requests has 70318 (58.9%) zeros [Zeros]
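
The imbalance percentages in the alerts are consistent with a normalized-entropy score: one minus the Shannon entropy of the class shares divided by the maximum possible entropy for that number of classes. The sketch below is an assumption about how the score is derived, not a copy of the profiler's code; `imbalance_score` is a hypothetical helper. Fed with the value counts reported later for repeated_guest and num_babies, it lands on the alert figures (79.6% and 97.2%).

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts taken from the variable sections of this report.
repeated_guest = [115580, 3810]
num_babies = [118473, 900, 15, 1, 1]

print(f"repeated_guest: {100 * imbalance_score(repeated_guest):.1f}%")
print(f"num_babies: {100 * imbalance_score(num_babies):.1f}%")
```

A score near 0% would indicate evenly spread classes, so every flagged variable here is dominated by a single value.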

Reproduction

Analysis started: 2024-02-06 10:33:37.145233
Analysis finished: 2024-02-06 10:35:12.705780
Duration: 1 minute and 35.56 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
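
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-02-06 10:33:37.145233")
finished = datetime.fromisoformat("2024-02-06 10:35:12.705780")

duration = finished - started
minutes, seconds = divmod(duration.total_seconds(), 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```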

Variables

cancellation
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
0: 75166
1: 44224

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 75166 | 63.0%
1 | 44224 | 37.0%
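
The Frequency (%) column is each count divided by the number of observations. A minimal sketch for the cancellation counts above (the dictionary is typed in by hand from the table, not read from the dataset):

```python
# Value counts for "cancellation", copied from the Common Values table.
counts = {"0": 75166, "1": 44224}

total = sum(counts.values())
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in frequencies.items():
    print(f"{value}: {counts[value]} ({pct}%)")
```

Roughly 37% of bookings are cancelled, so the target is uneven but far less so than the variables flagged with an Imbalance alert.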

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 75166
63.0%
1 44224
37.0%

Most occurring characters

ValueCountFrequency (%)
0 75166
63.0%
1 44224
37.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 119390
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 75166
63.0%
1 44224
37.0%

Most occurring scripts

ValueCountFrequency (%)
Common 119390
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 75166
63.0%
1 44224
37.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 75166
63.0%
1 44224
37.0%

type
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
Hotel: 79330
Fancy Hotel: 40060

Length

Max length: 11
Median length: 5
Mean length: 7.0132339
Min length: 5

Characters and Unicode

Total characters: 837310
Distinct characters: 11
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
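
The length and character totals for type can be reproduced from the two category counts alone. The sketch below assumes the counts reported for this variable (79330 "Hotel", 40060 "Fancy Hotel"); the helper name `weighted_median_length` is hypothetical.

```python
# Category counts for "type", copied from this report.
counts = {"Hotel": 79330, "Fancy Hotel": 40060}

n = sum(counts.values())
total_chars = sum(len(value) * c for value, c in counts.items())
mean_length = total_chars / n
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)


def weighted_median_length(value_counts):
    """Length at which the cumulative row count first reaches half of all rows.

    This skips the averaging step a strict median needs when the midpoint falls
    exactly between two different lengths, which does not happen here.
    """
    ordered = sorted((len(value), c) for value, c in value_counts.items())
    half = sum(c for _, c in ordered) / 2
    running = 0
    for length, c in ordered:
        running += c
        if running >= half:
            return length


median_length = weighted_median_length(counts)
print(total_chars, round(mean_length, 7), median_length, max_length, min_length)
```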

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Fancy Hotel
2nd row: Fancy Hotel
3rd row: Fancy Hotel
4th row: Fancy Hotel
5th row: Fancy Hotel

Common Values

Value | Count | Frequency (%)
Hotel | 79330 | 66.4%
Fancy Hotel | 40060 | 33.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
hotel | 119390 | 74.9%
fancy | 40060 | 25.1%

Most occurring characters

ValueCountFrequency (%)
H 119390
14.3%
o 119390
14.3%
t 119390
14.3%
e 119390
14.3%
l 119390
14.3%
F 40060
 
4.8%
a 40060
 
4.8%
n 40060
 
4.8%
c 40060
 
4.8%
y 40060
 
4.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 637800 | 76.2%
Uppercase Letter | 159450 | 19.0%
Space Separator | 40060 | 4.8%
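
The category totals above can be recomputed with the standard library's `unicodedata` module, which returns the Unicode general category of each character (`Lu` uppercase letter, `Ll` lowercase letter, `Zs` space separator). This is a sketch based on the two category counts reported for type, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Category counts for "type", copied from this report.
value_counts = {"Hotel": 79330, "Fancy Hotel": 40060}

# Weight every character of a value by the number of rows holding that value.
category_counts = Counter()
for value, n_rows in value_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += n_rows

total_chars = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({100 * count / total_chars:.1f}%)")
```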

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 119390
18.7%
t 119390
18.7%
e 119390
18.7%
l 119390
18.7%
a 40060
 
6.3%
n 40060
 
6.3%
c 40060
 
6.3%
y 40060
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
H 119390
74.9%
F 40060
 
25.1%
Space Separator
Value | Count | Frequency (%)
(space) | 40060 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 797250
95.2%
Common 40060
 
4.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
H 119390
15.0%
o 119390
15.0%
t 119390
15.0%
e 119390
15.0%
l 119390
15.0%
F 40060
 
5.0%
a 40060
 
5.0%
n 40060
 
5.0%
c 40060
 
5.0%
y 40060
 
5.0%
Common
Value | Count | Frequency (%)
(space) | 40060 | 100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 837310
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 119390
14.3%
o 119390
14.3%
t 119390
14.3%
e 119390
14.3%
l 119390
14.3%
F 40060
 
4.8%
a 40060
 
4.8%
n 40060
 
4.8%
c 40060
 
4.8%
y 40060
 
4.8%

days_between_booking_arrival
Real number (ℝ)

ZEROS 

Distinct: 479
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 104.01142
Minimum: 0
Maximum: 737
Zeros: 6345
Zeros (%): 5.3%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 18
median: 69
Q3: 160
95-th percentile: 320
Maximum: 737
Range: 737
Interquartile range (IQR): 142

Descriptive statistics

Standard deviation: 106.8631
Coefficient of variation (CV): 1.027417
Kurtosis: 1.6964488
Mean: 104.01142
Median Absolute Deviation (MAD): 60
Skewness: 1.3465499
Sum: 12417923
Variance: 11419.722
Monotonicity: Not monotonic
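
Several of these statistics are derived from one another, which gives a quick consistency check. The sketch below uses only the figures reported for days_between_booking_arrival; the variable names are illustrative.

```python
# Figures copied from the statistics for days_between_booking_arrival.
n = 119_390
total = 12_417_923          # Sum
std = 106.8631              # Standard deviation
q1, q3 = 18, 160
minimum, maximum = 0, 737

mean = total / n            # Mean = Sum / number of observations
cv = std / mean             # Coefficient of variation
variance = std ** 2         # Variance is the squared standard deviation
iqr = q3 - q1               # Interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} cv={cv:.6f} variance={variance:.3f} iqr={iqr} range={value_range}")
```

A CV just above 1 means the spread is about as large as the mean itself, which fits a lead time that ranges from same-day bookings to more than two years ahead.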
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 6345
 
5.3%
1 3460
 
2.9%
2 2069
 
1.7%
3 1816
 
1.5%
4 1715
 
1.4%
5 1565
 
1.3%
6 1445
 
1.2%
7 1331
 
1.1%
8 1138
 
1.0%
12 1079
 
0.9%
Other values (469) 97427
81.6%
ValueCountFrequency (%)
0 6345
5.3%
1 3460
2.9%
2 2069
 
1.7%
3 1816
 
1.5%
4 1715
 
1.4%
5 1565
 
1.3%
6 1445
 
1.2%
7 1331
 
1.1%
8 1138
 
1.0%
9 992
 
0.8%
ValueCountFrequency (%)
737 1
 
< 0.1%
709 1
 
< 0.1%
629 17
< 0.1%
626 30
< 0.1%
622 17
< 0.1%
615 17
< 0.1%
608 17
< 0.1%
605 30
< 0.1%
601 17
< 0.1%
594 17
< 0.1%
Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
2016: 56707
2017: 40687
2015: 21996

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters477560
Distinct characters6
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row2015
2nd row2015
3rd row2015
4th row2015
5th row2015

Common Values

ValueCountFrequency (%)
2016 56707
47.5%
2017 40687
34.1%
2015 21996
 
18.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2016 56707
47.5%
2017 40687
34.1%
2015 21996
 
18.4%

Most occurring characters

ValueCountFrequency (%)
2 119390
25.0%
0 119390
25.0%
1 119390
25.0%
6 56707
11.9%
7 40687
 
8.5%
5 21996
 
4.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 477560
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 119390
25.0%
0 119390
25.0%
1 119390
25.0%
6 56707
11.9%
7 40687
 
8.5%
5 21996
 
4.6%

Most occurring scripts

ValueCountFrequency (%)
Common 477560
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 119390
25.0%
0 119390
25.0%
1 119390
25.0%
6 56707
11.9%
7 40687
 
8.5%
5 21996
 
4.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 477560
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 119390
25.0%
0 119390
25.0%
1 119390
25.0%
6 56707
11.9%
7 40687
 
8.5%
5 21996
 
4.6%
Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
August: 13877
July: 12661
May: 11791
October: 11160
April: 11089
Other values (7): 58812

Length

Max length9
Median length7
Mean length5.9031828
Min length3

Characters and Unicode

Total characters704781
Distinct characters26
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowJuly
2nd rowJuly
3rd rowJuly
4th rowJuly
5th rowJuly

Common Values

ValueCountFrequency (%)
August 13877
11.6%
July 12661
10.6%
May 11791
9.9%
October 11160
9.3%
April 11089
9.3%
June 10939
9.2%
September 10508
8.8%
March 9794
8.2%
February 8068
6.8%
November 6794
5.7%
Other values (2) 12709
10.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
august 13877
11.6%
july 12661
10.6%
may 11791
9.9%
october 11160
9.3%
april 11089
9.3%
june 10939
9.2%
september 10508
8.8%
march 9794
8.2%
february 8068
6.8%
november 6794
5.7%
Other values (2) 12709
10.6%

Most occurring characters

ValueCountFrequency (%)
e 95619
13.6%
r 78190
 
11.1%
u 65351
 
9.3%
b 43310
 
6.1%
a 41511
 
5.9%
y 38449
 
5.5%
t 35545
 
5.0%
J 29529
 
4.2%
c 27734
 
3.9%
A 24966
 
3.5%
Other values (16) 224577
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 585391
83.1%
Uppercase Letter 119390
 
16.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 95619
16.3%
r 78190
13.4%
u 65351
11.2%
b 43310
 
7.4%
a 41511
 
7.1%
y 38449
 
6.6%
t 35545
 
6.1%
c 27734
 
4.7%
m 24082
 
4.1%
l 23750
 
4.1%
Other values (8) 111850
19.1%
Uppercase Letter
ValueCountFrequency (%)
J 29529
24.7%
A 24966
20.9%
M 21585
18.1%
O 11160
 
9.3%
S 10508
 
8.8%
F 8068
 
6.8%
N 6794
 
5.7%
D 6780
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 704781
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 95619
13.6%
r 78190
 
11.1%
u 65351
 
9.3%
b 43310
 
6.1%
a 41511
 
5.9%
y 38449
 
5.5%
t 35545
 
5.0%
J 29529
 
4.2%
c 27734
 
3.9%
A 24966
 
3.5%
Other values (16) 224577
31.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 704781
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 95619
13.6%
r 78190
 
11.1%
u 65351
 
9.3%
b 43310
 
6.1%
a 41511
 
5.9%
y 38449
 
5.5%
t 35545
 
5.0%
J 29529
 
4.2%
c 27734
 
3.9%
A 24966
 
3.5%
Other values (16) 224577
31.9%

week_number_arrival_date
Real number (ℝ)

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.165173
Minimum: 1
Maximum: 53
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 16
median: 28
Q3: 38
95-th percentile: 49
Maximum: 53
Range: 52
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 13.605138
Coefficient of variation (CV): 0.50083018
Kurtosis: -0.98607718
Mean: 27.165173
Median Absolute Deviation (MAD): 11
Skewness: -0.010014326
Sum: 3243250
Variance: 185.09979
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
33 3580
 
3.0%
30 3087
 
2.6%
32 3045
 
2.6%
34 3040
 
2.5%
18 2926
 
2.5%
21 2854
 
2.4%
28 2853
 
2.4%
17 2805
 
2.3%
20 2785
 
2.3%
29 2763
 
2.3%
Other values (43) 89652
75.1%
ValueCountFrequency (%)
1 1047
0.9%
2 1218
1.0%
3 1319
1.1%
4 1487
1.2%
5 1387
1.2%
6 1508
1.3%
7 2109
1.8%
8 2216
1.9%
9 2117
1.8%
10 2149
1.8%
ValueCountFrequency (%)
53 1816
1.5%
52 1195
1.0%
51 933
0.8%
50 1505
1.3%
49 1782
1.5%
48 1504
1.3%
47 1685
1.4%
46 1574
1.3%
45 1941
1.6%
44 2272
1.9%

day_of_month_arrival_date
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.798241
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 8
median: 16
Q3: 23
95-th percentile: 30
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.7808295
Coefficient of variation (CV): 0.55581058
Kurtosis: -1.1871683
Mean: 15.798241
Median Absolute Deviation (MAD): 8
Skewness: -0.002000454
Sum: 1886152
Variance: 77.102966
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
17 4406
 
3.7%
5 4317
 
3.6%
15 4196
 
3.5%
25 4160
 
3.5%
26 4147
 
3.5%
9 4096
 
3.4%
12 4087
 
3.4%
16 4078
 
3.4%
2 4055
 
3.4%
19 4052
 
3.4%
Other values (21) 77796
65.2%
ValueCountFrequency (%)
1 3626
3.0%
2 4055
3.4%
3 3855
3.2%
4 3763
3.2%
5 4317
3.6%
6 3833
3.2%
7 3665
3.1%
8 3921
3.3%
9 4096
3.4%
10 3575
3.0%
ValueCountFrequency (%)
31 2208
1.8%
30 3853
3.2%
29 3580
3.0%
28 3946
3.3%
27 3802
3.2%
26 4147
3.5%
25 4160
3.5%
24 3993
3.3%
23 3616
3.0%
22 3596
3.0%

num_weekend_nights
Real number (ℝ)

ZEROS 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.92759863
Minimum: 0
Maximum: 19
Zeros: 51998
Zeros (%): 43.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 2
Maximum: 19
Range: 19
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 0.99861349
Coefficient of variation (CV): 1.0765578
Kurtosis: 7.1740661
Mean: 0.92759863
Median Absolute Deviation (MAD): 1
Skewness: 1.3800464
Sum: 110746
Variance: 0.99722891
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
0 51998
43.6%
2 33308
27.9%
1 30626
25.7%
4 1855
 
1.6%
3 1259
 
1.1%
6 153
 
0.1%
5 79
 
0.1%
8 60
 
0.1%
7 19
 
< 0.1%
9 11
 
< 0.1%
Other values (7) 22
 
< 0.1%
ValueCountFrequency (%)
0 51998
43.6%
1 30626
25.7%
2 33308
27.9%
3 1259
 
1.1%
4 1855
 
1.6%
5 79
 
0.1%
6 153
 
0.1%
7 19
 
< 0.1%
8 60
 
0.1%
9 11
 
< 0.1%
ValueCountFrequency (%)
19 1
 
< 0.1%
18 1
 
< 0.1%
16 3
 
< 0.1%
14 2
 
< 0.1%
13 3
 
< 0.1%
12 5
 
< 0.1%
10 7
 
< 0.1%
9 11
 
< 0.1%
8 60
0.1%
7 19
 
< 0.1%

num_workweek_nights
Real number (ℝ)

ZEROS 

Distinct: 35
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5003015
Minimum: 0
Maximum: 50
Zeros: 7645
Zeros (%): 6.4%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 50
Range: 50
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9082856
Coefficient of variation (CV): 0.76322219
Kurtosis: 24.284555
Mean: 2.5003015
Median Absolute Deviation (MAD): 1
Skewness: 2.8622492
Sum: 298511
Variance: 3.641554
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)
ValueCountFrequency (%)
2 33684
28.2%
1 30310
25.4%
3 22258
18.6%
5 11077
 
9.3%
4 9563
 
8.0%
0 7645
 
6.4%
6 1499
 
1.3%
10 1036
 
0.9%
7 1029
 
0.9%
8 656
 
0.5%
Other values (25) 633
 
0.5%
ValueCountFrequency (%)
0 7645
 
6.4%
1 30310
25.4%
2 33684
28.2%
3 22258
18.6%
4 9563
 
8.0%
5 11077
 
9.3%
6 1499
 
1.3%
7 1029
 
0.9%
8 656
 
0.5%
9 231
 
0.2%
ValueCountFrequency (%)
50 1
 
< 0.1%
42 1
 
< 0.1%
41 1
 
< 0.1%
40 2
 
< 0.1%
35 1
 
< 0.1%
34 1
 
< 0.1%
33 1
 
< 0.1%
32 1
 
< 0.1%
30 5
< 0.1%
26 1
 
< 0.1%

num_adults
Real number (ℝ)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8564034
Minimum: 0
Maximum: 55
Zeros: 403
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
median: 2
Q3: 2
95-th percentile: 3
Maximum: 55
Range: 55
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.579261
Coefficient of variation (CV): 0.31203401
Kurtosis: 1352.1151
Mean: 1.8564034
Median Absolute Deviation (MAD): 0
Skewness: 18.317805
Sum: 221636
Variance: 0.3355433
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
ValueCountFrequency (%)
2 89680
75.1%
1 23027
 
19.3%
3 6202
 
5.2%
0 403
 
0.3%
4 62
 
0.1%
26 5
 
< 0.1%
27 2
 
< 0.1%
20 2
 
< 0.1%
5 2
 
< 0.1%
40 1
 
< 0.1%
Other values (4) 4
 
< 0.1%
ValueCountFrequency (%)
0 403
 
0.3%
1 23027
 
19.3%
2 89680
75.1%
3 6202
 
5.2%
4 62
 
0.1%
5 2
 
< 0.1%
6 1
 
< 0.1%
10 1
 
< 0.1%
20 2
 
< 0.1%
26 5
 
< 0.1%
ValueCountFrequency (%)
55 1
 
< 0.1%
50 1
 
< 0.1%
40 1
 
< 0.1%
27 2
 
< 0.1%
26 5
 
< 0.1%
20 2
 
< 0.1%
10 1
 
< 0.1%
6 1
 
< 0.1%
5 2
 
< 0.1%
4 62
0.1%

num_children
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 4
Missing (%): < 0.1%
Memory size: 932.9 KiB
0.0: 110796
1.0: 4861
2.0: 3652
3.0: 76
10.0: 1

Length

Max length4
Median length3
Mean length3.0000084
Min length3

Characters and Unicode

Total characters358159
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row0.0
2nd row0.0
3rd row0.0
4th row0.0
5th row0.0

Common Values

ValueCountFrequency (%)
0.0 110796
92.8%
1.0 4861
 
4.1%
2.0 3652
 
3.1%
3.0 76
 
0.1%
10.0 1
 
< 0.1%
(Missing) 4
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0.0 110796
92.8%
1.0 4861
 
4.1%
2.0 3652
 
3.1%
3.0 76
 
0.1%
10.0 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 230183
64.3%
. 119386
33.3%
1 4862
 
1.4%
2 3652
 
1.0%
3 76
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 238773
66.7%
Other Punctuation 119386
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 230183
96.4%
1 4862
 
2.0%
2 3652
 
1.5%
3 76
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 119386
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 358159
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 230183
64.3%
. 119386
33.3%
1 4862
 
1.4%
2 3652
 
1.0%
3 76
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 358159
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 230183
64.3%
. 119386
33.3%
1 4862
 
1.4%
2 3652
 
1.0%
3 76
 
< 0.1%

num_babies
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
0: 118473
1: 900
2: 15
10: 1
9: 1

Length

Max length2
Median length1
Mean length1.0000084
Min length1

Characters and Unicode

Total characters119391
Distinct characters4
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2 ?
Unique (%)< 0.1%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 118473
99.2%
1 900
 
0.8%
2 15
 
< 0.1%
10 1
 
< 0.1%
9 1
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 118473
99.2%
1 900
 
0.8%
2 15
 
< 0.1%
10 1
 
< 0.1%
9 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
0 118474
99.2%
1 901
 
0.8%
2 15
 
< 0.1%
9 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 119391
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 118474
99.2%
1 901
 
0.8%
2 15
 
< 0.1%
9 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 119391
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 118474
99.2%
1 901
 
0.8%
2 15
 
< 0.1%
9 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 119391
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 118474
99.2%
1 901
 
0.8%
2 15
 
< 0.1%
9 1
 
< 0.1%

breakfast
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 116.7 KiB
True: 92310
False: 27080

Value | Count | Frequency (%)
True | 92310 | 77.3%
False | 27080 | 22.7%
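
The memory sizes in this report (116.7 KiB for this Boolean column, 932.9 KiB for most others) are consistent with a simple model: one fixed-width item per row plus a small constant for the index. The sketch below is a back-of-the-envelope check under two assumptions that the report does not state: a 128-byte index overhead, and item widths of 1 byte for Boolean values and 8 bytes for 64-bit numbers or object references.

```python
n_rows = 119_390
index_bytes = 128  # assumption: fixed overhead for the row index, not stated in the report


def column_kib(item_bytes):
    """Estimated column size in KiB for a fixed-width item type (hypothetical helper)."""
    return (n_rows * item_bytes + index_bytes) / 1024


bool_kib = column_kib(1)   # assumption: 1 byte per Boolean value
wide_kib = column_kib(8)   # assumption: 8 bytes per 64-bit number or object reference

print(f"Boolean column: {bool_kib:.1f} KiB")
print(f"8-byte column: {wide_kib:.1f} KiB")
```

If the model holds, the sizes of the categorical columns count references only and do not include the strings they point to.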
Distinct: 177
Distinct (%): 0.1%
Missing: 488
Missing (%): 0.4%
Memory size: 932.9 KiB

Length

Max length3
Median length3
Mean length2.9892432
Min length2

Characters and Unicode

Total characters355427
Distinct characters26
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30 ?
Unique (%)< 0.1%

Sample

1st rowPRT
2nd rowPRT
3rd rowGBR
4th rowGBR
5th rowGBR
ValueCountFrequency (%)
prt 48590
40.9%
gbr 12129
 
10.2%
fra 10415
 
8.8%
esp 8568
 
7.2%
deu 7287
 
6.1%
ita 3766
 
3.2%
irl 3375
 
2.8%
bel 2342
 
2.0%
bra 2224
 
1.9%
nld 2104
 
1.8%
Other values (167) 18102
 
15.2%

Most occurring characters

ValueCountFrequency (%)
R 80804
22.7%
P 58506
16.5%
T 54263
15.3%
A 21627
 
6.1%
E 21538
 
6.1%
B 17051
 
4.8%
S 13931
 
3.9%
U 13293
 
3.7%
G 13130
 
3.7%
F 10956
 
3.1%
Other values (16) 50328
14.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 355427
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R 80804
22.7%
P 58506
16.5%
T 54263
15.3%
A 21627
 
6.1%
E 21538
 
6.1%
B 17051
 
4.8%
S 13931
 
3.9%
U 13293
 
3.7%
G 13130
 
3.7%
F 10956
 
3.1%
Other values (16) 50328
14.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 355427
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R 80804
22.7%
P 58506
16.5%
T 54263
15.3%
A 21627
 
6.1%
E 21538
 
6.1%
B 17051
 
4.8%
S 13931
 
3.9%
U 13293
 
3.7%
G 13130
 
3.7%
F 10956
 
3.1%
Other values (16) 50328
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 355427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R 80804
22.7%
P 58506
16.5%
T 54263
15.3%
A 21627
 
6.1%
E 21538
 
6.1%
B 17051
 
4.8%
S 13931
 
3.9%
U 13293
 
3.7%
G 13130
 
3.7%
F 10956
 
3.1%
Other values (16) 50328
14.2%

market_segment
Real number (ℝ)

ZEROS 

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4675768
Minimum: 0
Maximum: 7
Zeros: 12606
Zeros (%): 10.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 2
Q3: 3
95-th percentile: 5
Maximum: 7
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4209671
Coefficient of variation (CV): 0.57585524
Kurtosis: -0.091205613
Mean: 2.4675768
Median Absolute Deviation (MAD): 1
Skewness: 0.40380156
Sum: 294604
Variance: 2.0191474
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
ValueCountFrequency (%)
2 56477
47.3%
3 24219
20.3%
5 19811
 
16.6%
0 12606
 
10.6%
1 5295
 
4.4%
4 743
 
0.6%
7 237
 
0.2%
6 2
 
< 0.1%
ValueCountFrequency (%)
0 12606
 
10.6%
1 5295
 
4.4%
2 56477
47.3%
3 24219
20.3%
4 743
 
0.6%
5 19811
 
16.6%
6 2
 
< 0.1%
7 237
 
0.2%
ValueCountFrequency (%)
7 237
 
0.2%
6 2
 
< 0.1%
5 19811
 
16.6%
4 743
 
0.6%
3 24219
20.3%
2 56477
47.3%
1 5295
 
4.4%
0 12606
 
10.6%

distribution_channel
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
2: 97870
0: 14645
1: 6677
4: 193
3: 5

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters119390
Distinct characters5
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row1
5th row2

Common Values

ValueCountFrequency (%)
2 97870
82.0%
0 14645
 
12.3%
1 6677
 
5.6%
4 193
 
0.2%
3 5
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2 97870
82.0%
0 14645
 
12.3%
1 6677
 
5.6%
4 193
 
0.2%
3 5
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
2 97870
82.0%
0 14645
 
12.3%
1 6677
 
5.6%
4 193
 
0.2%
3 5
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 119390
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 97870
82.0%
0 14645
 
12.3%
1 6677
 
5.6%
4 193
 
0.2%
3 5
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 119390
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 97870
82.0%
0 14645
 
12.3%
1 6677
 
5.6%
4 193
 
0.2%
3 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 97870
82.0%
0 14645
 
12.3%
1 6677
 
5.6%
4 193
 
0.2%
3 5
 
< 0.1%

repeated_guest
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
0: 115580
1: 3810

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters119390
Distinct characters2
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row0
2nd row0
3rd row0
4th row0
5th row0

Common Values

ValueCountFrequency (%)
0 115580
96.8%
1 3810
 
3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 115580
96.8%
1 3810
 
3.2%

Most occurring characters

ValueCountFrequency (%)
0 115580
96.8%
1 3810
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 119390
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 115580
96.8%
1 3810
 
3.2%

Most occurring scripts

ValueCountFrequency (%)
Common 119390
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 115580
96.8%
1 3810
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 119390
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 115580
96.8%
1 3810
 
3.2%

num_previous_cancellations
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.087117849
Minimum: 0
Maximum: 26
Zeros: 112906
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 26
Range: 26
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.84433638
Coefficient of variation (CV): 9.6918874
Kurtosis: 674.07369
Mean: 0.087117849
Median Absolute Deviation (MAD): 0
Skewness: 24.458049
Sum: 10401
Variance: 0.71290393
Monotonicity: Not monotonic
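
The skewness behind the Skewed alert (γ1 ≈ 24.458) can be recomputed from this variable's frequency tables, which together list all 15 distinct values. The sketch below applies the adjusted Fisher-Pearson sample skewness; that this is the exact estimator used by the profiler is an assumption, although at this sample size the bias adjustment changes the result only in the fifth decimal place.

```python
import math

# All 15 distinct values of num_previous_cancellations with their counts,
# assembled from the frequency tables of this variable.
value_counts = {
    0: 112906, 1: 6051, 2: 116, 3: 65, 4: 31, 5: 19, 6: 22, 11: 35,
    13: 12, 14: 14, 19: 19, 21: 1, 24: 48, 25: 25, 26: 26,
}

n = sum(value_counts.values())
total = sum(v * c for v, c in value_counts.items())
mean = total / n

# Second and third central moments (population form).
m2 = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in value_counts.items()) / n

g1 = m3 / m2 ** 1.5                                  # population skewness
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)     # bias-adjusted sample skewness

print(f"n={n} sum={total} skewness={skewness:.4f}")
```

The long right tail comes from a few hundred bookings with 11 or more previous cancellations, against 94.6% of rows at zero.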
Histogram with fixed size bins (bins=15)
ValueCountFrequency (%)
0 112906
94.6%
1 6051
 
5.1%
2 116
 
0.1%
3 65
 
0.1%
24 48
 
< 0.1%
11 35
 
< 0.1%
4 31
 
< 0.1%
26 26
 
< 0.1%
25 25
 
< 0.1%
6 22
 
< 0.1%
Other values (5) 65
 
0.1%
ValueCountFrequency (%)
0 112906
94.6%
1 6051
 
5.1%
2 116
 
0.1%
3 65
 
0.1%
4 31
 
< 0.1%
5 19
 
< 0.1%
6 22
 
< 0.1%
11 35
 
< 0.1%
13 12
 
< 0.1%
14 14
 
< 0.1%
ValueCountFrequency (%)
26 26
< 0.1%
25 25
< 0.1%
24 48
< 0.1%
21 1
 
< 0.1%
19 19
 
< 0.1%
14 14
 
< 0.1%
13 12
 
< 0.1%
11 35
< 0.1%
6 22
< 0.1%
5 19
 
< 0.1%

num_previous_stays
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.13709691
Minimum: 0
Maximum: 72
Zeros: 115770
Zeros (%): 97.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 72
Range: 72
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.4974368
Coefficient of variation (CV): 10.92247
Kurtosis: 767.24521
Mean: 0.13709691
Median Absolute Deviation (MAD): 0
Skewness: 23.5398
Sum: 16368
Variance: 2.2423171
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 115770
97.0%
1 1542
 
1.3%
2 580
 
0.5%
3 333
 
0.3%
4 229
 
0.2%
5 181
 
0.2%
6 115
 
0.1%
7 88
 
0.1%
8 70
 
0.1%
9 60
 
0.1%
Other values (63) 422
 
0.4%
ValueCountFrequency (%)
0 115770
97.0%
1 1542
 
1.3%
2 580
 
0.5%
3 333
 
0.3%
4 229
 
0.2%
5 181
 
0.2%
6 115
 
0.1%
7 88
 
0.1%
8 70
 
0.1%
9 60
 
0.1%
ValueCountFrequency (%)
72 1
< 0.1%
71 1
< 0.1%
70 1
< 0.1%
69 1
< 0.1%
68 1
< 0.1%
67 1
< 0.1%
66 1
< 0.1%
65 1
< 0.1%
64 1
< 0.1%
63 1
< 0.1%

reserved_room
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB
A: 85994
D: 19201
E: 6535
F: 2897
G: 2094
Other values (5): 2669

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters119390
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowC
2nd rowC
3rd rowA
4th rowA
5th rowA

Common Values

ValueCountFrequency (%)
A 85994
72.0%
D 19201
 
16.1%
E 6535
 
5.5%
F 2897
 
2.4%
G 2094
 
1.8%
B 1118
 
0.9%
C 932
 
0.8%
H 601
 
0.5%
P 12
 
< 0.1%
L 6
 
< 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
a 85994
72.0%
d 19201
 
16.1%
e 6535
 
5.5%
f 2897
 
2.4%
g 2094
 
1.8%
b 1118
 
0.9%
c 932
 
0.8%
h 601
 
0.5%
p 12
 
< 0.1%
l 6
 
< 0.1%

Most occurring characters

Value   Count   Frequency (%)
A       85994   72.0%
D       19201   16.1%
E       6535    5.5%
F       2897    2.4%
G       2094    1.8%
B       1118    0.9%
C       932     0.8%
H       601     0.5%
P       12      < 0.1%
L       6       < 0.1%

Most occurring categories

Value              Count    Frequency (%)
Uppercase Letter   119390   100.0%

Most frequent character per category

Uppercase Letter
Value   Count   Frequency (%)
A       85994   72.0%
D       19201   16.1%
E       6535    5.5%
F       2897    2.4%
G       2094    1.8%
B       1118    0.9%
C       932     0.8%
H       601     0.5%
P       12      < 0.1%
L       6       < 0.1%

Most occurring scripts

Value   Count    Frequency (%)
Latin   119390   100.0%

Most frequent character per script

Latin
Value   Count   Frequency (%)
A       85994   72.0%
D       19201   16.1%
E       6535    5.5%
F       2897    2.4%
G       2094    1.8%
B       1118    0.9%
C       932     0.8%
H       601     0.5%
P       12      < 0.1%
L       6       < 0.1%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   119390   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
A       85994   72.0%
D       19201   16.1%
E       6535    5.5%
F       2897    2.4%
G       2094    1.8%
B       1118    0.9%
C       932     0.8%
H       601     0.5%
P       12      < 0.1%
L       6       < 0.1%
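The IMBALANCE alert on reserved_room corresponds to the 58.3% quoted in the Alerts list. The report does not state how that figure is derived. A minimal sketch, assuming the score is one minus the Shannon entropy of the class shares normalised by the logarithm of the number of classes, is shown below; the function name imbalance_score is hypothetical. Under that assumption the Common Values counts give about 0.583, in line with the reported value.

```python
import math

# Value counts for reserved_room, copied from the Common Values table.
counts = {
    "A": 85994, "D": 19201, "E": 6535, "F": 2897, "G": 2094,
    "B": 1118, "C": 932, "H": 601, "P": 12, "L": 6,
}


def imbalance_score(value_counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the k class shares.

    Assumed definition: 0 means perfectly balanced classes, values close
    to 1 mean a single class dominates.
    """
    total = sum(value_counts.values())
    shares = [c / total for c in value_counts.values() if c > 0]
    k = len(shares)
    if k < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in shares)
    return 1.0 - entropy / math.log(k)


score = imbalance_score(counts)
print(f"reserved_room imbalance: {score:.3f}")
```

The same function applied to the three deposit_policy counts gives roughly 0.653, again close to the percentage in the Alerts list, which supports the assumed definition without proving it.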

changes_between_booking_arrival
Real number (ℝ)

ZEROS 

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.22112405
Minimum: 0
Maximum: 21
Zeros: 101314
Zeros (%): 84.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 21
Range: 21
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.65230557
Coefficient of variation (CV): 2.9499531
Kurtosis: 79.393605
Mean: 0.22112405
Median Absolute Deviation (MAD): 0
Skewness: 6.0002701
Sum: 26400
Variance: 0.42550256
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=21)]
Common values

Value               Count    Frequency (%)
0                   101314   84.9%
1                   12701    10.6%
2                   3805     3.2%
3                   927      0.8%
4                   376      0.3%
5                   118      0.1%
6                   63       0.1%
7                   31       < 0.1%
8                   17       < 0.1%
9                   8        < 0.1%
Other values (11)   30       < 0.1%

Minimum 10 values

Value   Count    Frequency (%)
0       101314   84.9%
1       12701    10.6%
2       3805     3.2%
3       927      0.8%
4       376      0.3%
5       118      0.1%
6       63       0.1%
7       31       < 0.1%
8       17       < 0.1%
9       8        < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
21      1       < 0.1%
20      1       < 0.1%
18      1       < 0.1%
17      2       < 0.1%
16      2       < 0.1%
15      3       < 0.1%
14      5       < 0.1%
13      5       < 0.1%
12      2       < 0.1%
11      2       < 0.1%

deposit_policy
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Value        Count
No Deposit   104641
Non Refund   14587
Refundable   162

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1193900
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No Deposit
2nd row: No Deposit
3rd row: No Deposit
4th row: No Deposit
5th row: No Deposit

Common Values

Value        Count    Frequency (%)
No Deposit   104641   87.6%
Non Refund   14587    12.2%
Refundable   162      0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value        Count    Frequency (%)
no           104641   43.9%
deposit      104641   43.9%
non          14587    6.1%
refund       14587    6.1%
refundable   162      0.1%

Most occurring characters

Value              Count    Frequency (%)
o                  223869   18.8%
e                  119552   10.0%
N                  119228   10.0%
(space)            119228   10.0%
s                  104641   8.8%
i                  104641   8.8%
t                  104641   8.8%
p                  104641   8.8%
D                  104641   8.8%
n                  29336    2.5%
Other values (7)   59482    5.0%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   836054   70.0%
Uppercase Letter   238618   20.0%
Space Separator    119228   10.0%

Most frequent character per category

Lowercase Letter
Value              Count    Frequency (%)
o                  223869   26.8%
e                  119552   14.3%
s                  104641   12.5%
i                  104641   12.5%
t                  104641   12.5%
p                  104641   12.5%
n                  29336    3.5%
f                  14749    1.8%
u                  14749    1.8%
d                  14749    1.8%
Other values (3)   486      0.1%

Uppercase Letter
Value   Count    Frequency (%)
N       119228   50.0%
D       104641   43.9%
R       14749    6.2%

Space Separator
Value     Count    Frequency (%)
(space)   119228   100.0%

Most occurring scripts

Value    Count     Frequency (%)
Latin    1074672   90.0%
Common   119228    10.0%

Most frequent character per script

Latin
Value              Count    Frequency (%)
o                  223869   20.8%
e                  119552   11.1%
N                  119228   11.1%
s                  104641   9.7%
i                  104641   9.7%
t                  104641   9.7%
p                  104641   9.7%
D                  104641   9.7%
n                  29336    2.7%
R                  14749    1.4%
Other values (6)   44733    4.2%

Common
Value     Count    Frequency (%)
(space)   119228   100.0%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   1193900   100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
o                  223869   18.8%
e                  119552   10.0%
N                  119228   10.0%
(space)            119228   10.0%
s                  104641   8.8%
i                  104641   8.8%
t                  104641   8.8%
p                  104641   8.8%
D                  104641   8.8%
n                  29336    2.5%
Other values (7)   59482    5.0%
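The character and category tables for deposit_policy follow directly from the three Common Values counts: each character of a value is counted once per row holding that value. The sketch below reproduces that bookkeeping with the standard library, as an illustration rather than the profiler's implementation. It assumes the report's category names map to the two-letter Unicode general categories returned by unicodedata.category (Ll for Lowercase Letter, Lu for Uppercase Letter, Zs for Space Separator).

```python
import unicodedata
from collections import Counter

# Value counts for deposit_policy, copied from the Common Values table.
value_counts = {"No Deposit": 104641, "Non Refund": 14587, "Refundable": 162}

char_counts = Counter()      # occurrences of each character over all rows
category_counts = Counter()  # occurrences per Unicode general category

for value, n_rows in value_counts.items():
    for ch in value:
        char_counts[ch] += n_rows
        category_counts[unicodedata.category(ch)] += n_rows

total_chars = sum(char_counts.values())

# Most frequent characters first, as in "Most occurring characters".
for ch, count in char_counts.most_common(5):
    label = "(space)" if ch == " " else ch
    print(f"{label:8} {count:7d} {count / total_chars:6.1%}")

print(dict(category_counts))
```

The totals match the report: 1193900 characters, 17 distinct characters, and three categories with 836054, 238618 and 119228 occurrences.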

id_travel_agency_booking
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 333
Distinct (%): 0.3%
Missing: 16340
Missing (%): 13.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 86.693382
Minimum: 1
Maximum: 535
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 9
median: 14
Q3: 229
95-th percentile: 250
Maximum: 535
Range: 534
Interquartile range (IQR): 220

Descriptive statistics

Standard deviation: 110.77455
Coefficient of variation (CV): 1.277774
Kurtosis: -0.0071795649
Mean: 86.693382
Median Absolute Deviation (MAD): 13
Skewness: 1.0893856
Sum: 8933753
Variance: 12271
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count   Frequency (%)
9                    31961   26.8%
240                  13922   11.7%
1                    7191    6.0%
14                   3640    3.0%
7                    3539    3.0%
6                    3290    2.8%
250                  2870    2.4%
241                  1721    1.4%
28                   1666    1.4%
8                    1514    1.3%
Other values (323)   31736   26.6%
(Missing)            16340   13.7%

Minimum 10 values

Value   Count   Frequency (%)
1       7191    6.0%
2       162     0.1%
3       1336    1.1%
4       47      < 0.1%
5       330     0.3%
6       3290    2.8%
7       3539    3.0%
8       1514    1.3%
9       31961   26.8%
10      260     0.2%

Maximum 10 values

Value   Count   Frequency (%)
535     3       < 0.1%
531     68      0.1%
527     35      < 0.1%
526     10      < 0.1%
510     2       < 0.1%
509     10      < 0.1%
508     6       < 0.1%
502     24      < 0.1%
497     1       < 0.1%
495     57      < 0.1%

id_person_booking
Real number (ℝ)

MISSING 

Distinct: 352
Distinct (%): 5.2%
Missing: 112593
Missing (%): 94.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 189.26674
Minimum: 6
Maximum: 543
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 6
5-th percentile: 40
Q1: 62
median: 179
Q3: 270
95-th percentile: 435
Maximum: 543
Range: 537
Interquartile range (IQR): 208

Descriptive statistics

Standard deviation: 131.65501
Coefficient of variation (CV): 0.69560567
Kurtosis: -0.49079521
Mean: 189.26674
Median Absolute Deviation (MAD): 111
Skewness: 0.60159967
Sum: 1286446
Variance: 17333.043
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                Count    Frequency (%)
40                   927      0.8%
223                  784      0.7%
67                   267      0.2%
45                   250      0.2%
153                  215      0.2%
174                  149      0.1%
219                  141      0.1%
281                  138      0.1%
154                  133      0.1%
405                  119      0.1%
Other values (342)   3674     3.1%
(Missing)            112593   94.3%

Minimum 10 values

Value   Count   Frequency (%)
6       1       < 0.1%
8       1       < 0.1%
9       37      < 0.1%
10      1       < 0.1%
11      1       < 0.1%
12      14      < 0.1%
14      9       < 0.1%
16      5       < 0.1%
18      1       < 0.1%
20      50      < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
543     2       < 0.1%
541     1       < 0.1%
539     2       < 0.1%
534     2       < 0.1%
531     1       < 0.1%
530     5       < 0.1%
528     2       < 0.1%
525     15      < 0.1%
523     19      < 0.1%
521     7       < 0.1%
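With 94.3% of id_person_booking missing, it matters which denominator each statistic uses. The figures above are consistent with Mean and Distinct (%) being taken over the non-missing rows only, while Missing (%) and the frequency tables use all 119390 rows. The sketch below checks that reading against the printed numbers; it is an inference from the report, not a statement about the profiler's internals.

```python
# Figures copied from the report for id_person_booking.
n_rows = 119390            # "Number of observations" in the dataset overview
missing = 112593           # Missing
distinct = 352             # Distinct
total = 1286446            # Sum
reported_mean = 189.26674  # Mean

n_present = n_rows - missing           # rows that actually hold a value
mean_over_present = total / n_present  # assumed: mean ignores missing rows
distinct_share = distinct / n_present  # assumed: Distinct (%) ignores missing rows
missing_share = missing / n_rows       # Missing (%) uses every row

print(f"non-missing rows: {n_present}")
print(f"mean:             {mean_over_present:.5f}")
print(f"distinct share:   {distinct_share:.1%}")
print(f"missing share:    {missing_share:.1%}")
```

Dividing Sum by all 119390 rows instead would give a mean near 10.8, far from the reported 189.26674, so any downstream use of this column should keep the 6797-row base in mind.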

customer_type
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Value   Count
0       89613
2       25124
1       4076
3       577

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       89613   75.1%
2       25124   21.0%
1       4076    3.4%
3       577     0.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value   Count   Frequency (%)
0       89613   75.1%
2       25124   21.0%
1       4076    3.4%
3       577     0.5%

Most occurring characters

Value   Count   Frequency (%)
0       89613   75.1%
2       25124   21.0%
1       4076    3.4%
3       577     0.5%

Most occurring categories

Value            Count    Frequency (%)
Decimal Number   119390   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       89613   75.1%
2       25124   21.0%
1       4076    3.4%
3       577     0.5%

Most occurring scripts

Value    Count    Frequency (%)
Common   119390   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       89613   75.1%
2       25124   21.0%
1       4076    3.4%
3       577     0.5%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   119390   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       89613   75.1%
2       25124   21.0%
1       4076    3.4%
3       577     0.5%

avg_price
Real number (ℝ)

ZEROS 

Distinct: 8726
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.71874
Minimum: 0
Maximum: 300
Zeros: 1960
Zeros (%): 1.6%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 38.4
Q1: 69.29
median: 94.575
Q3: 126
95-th percentile: 193.5
Maximum: 300
Range: 300
Interquartile range (IQR): 56.71

Descriptive statistics

Standard deviation: 47.823771
Coefficient of variation (CV): 0.47015691
Kurtosis: 1.5972017
Mean: 101.71874
Median Absolute Deviation (MAD): 27.825
Skewness: 0.94134242
Sum: 12144201
Variance: 2287.1131
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value                 Count   Frequency (%)
62                    3754    3.1%
75                    2715    2.3%
90                    2473    2.1%
65                    2418    2.0%
0                     1960    1.6%
80                    1889    1.6%
95                    1661    1.4%
120                   1607    1.3%
100                   1573    1.3%
85                    1538    1.3%
Other values (8716)   97802   81.9%

Minimum 10 values

Value   Count   Frequency (%)
0       1960    1.6%
0.26    1       < 0.1%
0.5     1       < 0.1%
1       15      < 0.1%
1.29    1       < 0.1%
1.48    1       < 0.1%
1.56    2       < 0.1%
1.6     1       < 0.1%
1.8     1       < 0.1%
2       12      < 0.1%

Maximum 10 values

Value    Count   Frequency (%)
300      290     0.2%
299.43   1       < 0.1%
299.33   2       < 0.1%
299.2    1       < 0.1%
299      10      < 0.1%
298.71   1       < 0.1%
298      5       < 0.1%
297.57   1       < 0.1%
297.5    1       < 0.1%
297.38   1       < 0.1%
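In the original layout the quartile labels run straight into their values (Q1 of 69.29 appears as "Q169.29"), so Range and Interquartile range (IQR) are a convenient way to confirm how each quantile line should be read. The sketch below recomputes both for avg_price and for the two identifier columns earlier in the report, assuming Range is Maximum minus Minimum and IQR is Q3 minus Q1; the helper name spread is hypothetical.

```python
def spread(minimum, q1, q3, maximum):
    """Return (range, interquartile range) for one variable."""
    return maximum - minimum, q3 - q1


# (Minimum, Q1, Q3, Maximum) followed by the reported Range and IQR.
reported = {
    "id_travel_agency_booking": ((1, 9, 229, 535), 534, 220),
    "id_person_booking": ((6, 62, 270, 543), 537, 208),
    "avg_price": ((0, 69.29, 126, 300), 300, 56.71),
}

for name, (quantiles, reported_range, reported_iqr) in reported.items():
    value_range, iqr = spread(*quantiles)
    # Compare with a small tolerance because 69.29 is not exact in binary floats.
    ok = abs(value_range - reported_range) < 1e-9 and abs(iqr - reported_iqr) < 1e-9
    print(f"{name}: range={value_range}, iqr={round(iqr, 2)}, consistent={ok}")
```

All three variables come out consistent, which fixes the reading of the quartiles as Q1 = 9 and Q3 = 229, Q1 = 62 and Q3 = 270, and Q1 = 69.29 and Q3 = 126 respectively.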

required_car_parking_spaces
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 932.9 KiB

Value   Count
0       111974
1       7383
2       28
3       3
8       2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 119390
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count    Frequency (%)
0       111974   93.8%
1       7383     6.2%
2       28       < 0.1%
3       3        < 0.1%
8       2        < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value   Count    Frequency (%)
0       111974   93.8%
1       7383     6.2%
2       28       < 0.1%
3       3        < 0.1%
8       2        < 0.1%

Most occurring characters

Value   Count    Frequency (%)
0       111974   93.8%
1       7383     6.2%
2       28       < 0.1%
3       3        < 0.1%
8       2        < 0.1%

Most occurring categories

Value            Count    Frequency (%)
Decimal Number   119390   100.0%

Most frequent character per category

Decimal Number
Value   Count    Frequency (%)
0       111974   93.8%
1       7383     6.2%
2       28       < 0.1%
3       3        < 0.1%
8       2        < 0.1%

Most occurring scripts

Value    Count    Frequency (%)
Common   119390   100.0%

Most frequent character per script

Common
Value   Count    Frequency (%)
0       111974   93.8%
1       7383     6.2%
2       28       < 0.1%
3       3        < 0.1%
8       2        < 0.1%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   119390   100.0%

Most frequent character per block

ASCII
Value   Count    Frequency (%)
0       111974   93.8%
1       7383     6.2%
2       28       < 0.1%
3       3        < 0.1%
8       2        < 0.1%

total_of_special_requests
Real number (ℝ)

ZEROS 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.57136276
Minimum: 0
Maximum: 5
Zeros: 70318
Zeros (%): 58.9%
Negative: 0
Negative (%): 0.0%
Memory size: 932.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 5
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.79279842
Coefficient of variation (CV): 1.387557
Kurtosis: 1.4925648
Mean: 0.57136276
Median Absolute Deviation (MAD): 0
Skewness: 1.3491894
Sum: 68215
Variance: 0.62852934
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]
Common values

Value   Count   Frequency (%)
0       70318   58.9%
1       33226   27.8%
2       12969   10.9%
3       2497    2.1%
4       340     0.3%
5       40      < 0.1%

Minimum 10 values

Value   Count   Frequency (%)
0       70318   58.9%
1       33226   27.8%
2       12969   10.9%
3       2497    2.1%
4       340     0.3%
5       40      < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
5       40      < 0.1%
4       340     0.3%
3       2497    2.1%
2       12969   10.9%
1       33226   27.8%
0       70318   58.9%
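Because total_of_special_requests takes only six distinct values, its frequency table holds the entire distribution, and Sum, Mean, Variance and Standard deviation can be rebuilt from it. The sketch below does so under the assumption that Variance uses the sample (n - 1) denominator; that choice is inferred from the printed digits and is not stated in the report.

```python
import math

# Value -> count, copied from the frequency table for total_of_special_requests.
freq = {0: 70318, 1: 33226, 2: 12969, 3: 2497, 4: 340, 5: 40}

n = sum(freq.values())                          # number of observations
total = sum(v * c for v, c in freq.items())     # Sum
mean = total / n                                # Mean

# Sample variance with an n - 1 denominator (assumed convention).
squared_dev = sum(c * (v - mean) ** 2 for v, c in freq.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)

print(f"n={n}, sum={total}")
print(f"mean={mean:.8f}")
print(f"variance={variance:.8f}")
print(f"std={std:.8f}")
```

The recomputed values agree with the reported Sum of 68215, Mean of 0.57136276, Variance of 0.62852934 and Standard deviation of 0.79279842.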

Interactions

[Figures: pairwise interaction plots (196 panels)]

Correlations

[Figure: correlation heatmap]
Variable  avg_price  breakfast  cancellation  changes_between_booking_arrival  customer_type  day_of_month_arrival_date  days_between_booking_arrival  deposit_policy  distribution_channel  id_person_booking  id_travel_agency_booking  market_segment  month_arrival_date  num_adults  num_babies  num_children  num_previous_cancellations  num_previous_stays  num_weekend_nights  num_workweek_nights  repeated_guest  required_car_parking_spaces  reserved_room  total_of_special_requests  type  week_number_arrival_date  year_arrival_date
avg_price  1.000  0.135  0.118  0.005  0.126  0.027  0.015  0.131  0.119  0.052  -0.049  -0.259  0.181  0.280  0.022  0.204  -0.150  -0.143  0.051  0.094  0.190  0.055  0.191  0.196  0.471  0.074  0.153
breakfast  0.135  1.000  0.013  -0.049  0.078  0.009  -0.063  0.072  0.121  -0.036  0.036  -0.065  0.081  -0.051  0.009  0.037  0.033  0.066  -0.061  -0.043  0.060  0.032  0.121  -0.021  0.041  0.000  0.039
cancellation  0.118  0.013  1.000  -0.185  0.136  -0.006  0.317  0.481  0.177  -0.011  -0.115  0.215  0.070  0.067  0.034  0.028  0.270  -0.115  -0.004  0.041  0.085  0.197  0.073  -0.259  0.136  0.008  0.026
changes_between_booking_arrival  0.005  -0.049  -0.185  1.000  0.028  0.012  -0.008  0.029  0.027  0.176  0.091  -0.071  0.010  -0.085  0.017  0.018  -0.073  0.031  0.040  0.064  0.000  0.016  0.014  0.042  0.040  0.008  0.016
customer_type  0.126  0.078  0.136  0.028  1.000  0.002  0.159  0.098  0.079  0.310  -0.042  0.373  0.103  -0.124  0.015  0.061  0.096  -0.036  -0.035  -0.028  0.105  0.041  0.109  -0.146  0.052  0.076  0.213
day_of_month_arrival_date  0.027  0.009  -0.006  0.012  0.002  1.000  0.008  0.054  0.028  0.046  0.005  0.001  0.058  0.002  0.005  0.010  -0.012  -0.001  -0.007  -0.016  0.017  0.008  0.010  0.003  0.026  0.061  0.044
days_between_booking_arrival  0.015  -0.063  0.317  -0.008  0.159  0.008  1.000  0.273  0.116  0.286  -0.123  0.409  0.132  0.192  0.007  0.028  0.171  -0.189  0.162  0.296  0.134  0.057  0.048  -0.074  0.094  0.113  0.104
deposit_policy  0.131  0.072  0.481  0.029  0.098  0.054  0.273  1.000  0.091  0.022  -0.137  0.455  0.101  -0.029  0.023  0.073  0.318  -0.064  -0.116  -0.055  0.058  0.071  0.152  -0.302  0.177  0.006  0.052
distribution_channel  0.119  0.121  0.177  0.027  0.079  0.028  0.116  0.091  1.000  0.136  -0.218  0.481  0.069  0.157  0.029  0.043  0.020  -0.245  0.087  0.102  0.297  0.076  0.100  0.091  0.187  0.009  0.027
id_person_booking  0.052  -0.036  -0.011  0.176  0.310  0.046  0.286  0.022  0.136  1.000  0.226  0.196  0.217  0.230  0.032  0.039  -0.198  -0.298  0.076  0.250  0.358  0.048  0.098  -0.128  0.498  -0.058  0.281
id_travel_agency_booking  -0.049  0.036  -0.115  0.091  -0.042  0.005  -0.123  -0.137  -0.218  0.226  1.000  -0.116  0.083  -0.056  0.026  0.058  -0.168  0.060  0.131  0.170  0.076  0.131  0.143  0.015  0.817  -0.057  0.091
market_segment  -0.259  -0.065  0.215  -0.071  0.373  0.001  0.409  0.455  0.481  0.196  -0.116  1.000  0.088  -0.017  0.034  0.100  0.194  -0.153  0.011  0.035  0.347  0.092  0.138  -0.293  0.147  0.048  0.159
month_arrival_date  0.181  0.081  0.070  0.010  0.103  0.058  0.132  0.101  0.069  0.217  0.083  0.088  1.000  -0.078  0.016  0.069  0.045  -0.007  -0.035  -0.024  0.075  0.018  0.045  -0.058  0.070  0.337  0.429
num_adults  0.280  -0.051  0.067  -0.085  -0.124  0.002  0.192  -0.029  0.157  0.230  -0.056  -0.017  -0.078  1.000  0.000  0.000  -0.036  -0.210  0.127  0.153  0.000  0.000  0.003  0.162  0.014  0.026  0.015
num_babies  0.022  0.009  0.034  0.017  0.015  0.005  0.007  0.023  0.029  0.032  0.026  0.034  0.016  0.000  1.000  0.025  -0.017  -0.011  0.023  0.026  0.007  0.020  0.040  0.093  0.049  0.013  0.009
num_children  0.204  0.037  0.028  0.018  0.061  0.010  0.028  0.073  0.043  0.039  0.058  0.100  0.069  0.000  0.025  1.000  -0.059  -0.035  0.053  0.054  0.035  0.030  0.357  0.096  0.046  0.006  0.044
num_previous_cancellations  -0.150  0.033  0.270  -0.073  0.096  -0.012  0.171  0.318  0.020  -0.198  -0.168  0.194  0.045  -0.036  -0.017  -0.059  1.000  0.102  -0.055  -0.062  0.185  0.000  0.006  -0.129  0.050  0.087  0.052
num_previous_stays  -0.143  0.066  -0.115  0.031  -0.036  -0.001  -0.189  -0.064  -0.245  -0.298  0.060  -0.153  -0.007  -0.210  -0.011  -0.035  0.102  1.000  -0.084  -0.119  0.320  0.019  0.003  0.025  0.017  -0.043  0.025
num_weekend_nights  0.051  -0.061  -0.004  0.040  -0.035  -0.007  0.162  -0.116  0.087  0.076  0.131  0.011  -0.035  0.127  0.023  0.053  -0.055  -0.084  1.000  0.238  0.082  0.015  0.054  0.079  0.198  0.026  0.029
num_workweek_nights  0.094  -0.043  0.041  0.064  -0.028  -0.016  0.296  -0.055  0.102  0.250  0.170  0.035  -0.024  0.153  0.026  0.054  -0.062  -0.119  0.238  1.000  0.017  0.017  0.044  0.076  0.192  0.026  0.014
repeated_guest  0.190  0.060  0.085  0.000  0.105  0.017  0.134  0.058  0.297  0.358  0.076  0.347  0.075  0.000  0.007  0.035  0.185  0.320  0.082  0.017  1.000  0.078  0.037  0.006  0.050  -0.030  0.010
required_car_parking_spaces  0.055  0.032  0.197  0.016  0.041  0.008  0.057  0.071  0.076  0.048  0.131  0.092  0.018  0.000  0.020  0.030  0.000  0.019  0.015  0.017  0.078  1.000  0.079  0.088  0.221  0.003  0.018
reserved_room  0.191  0.121  0.073  0.014  0.109  0.010  0.048  0.152  0.100  0.098  0.143  0.138  0.045  0.003  0.040  0.357  0.006  0.003  0.054  0.044  0.037  0.079  1.000  0.152  0.323  -0.011  0.082
total_of_special_requests  0.196  -0.021  -0.259  0.042  -0.146  0.003  -0.074  -0.302  0.091  -0.128  0.015  -0.293  -0.058  0.162  0.093  0.096  -0.129  0.025  0.079  0.076  0.006  0.088  0.152  1.000  0.046  0.019  0.091
type0.4710.0410.1360.0400.0520.0260.0940.1770.1870.4980.8170.1470.0700.0140.0490.0460.0500.0170.1980.1920.0500.2210.3230.0461.0000.0010.043
week_number_arrival_date0.0740.0000.0080.0080.0760.0610.1130.0060.009-0.058-0.0570.0480.3370.0260.0130.0060.087-0.0430.0260.026-0.0300.003-0.0110.0190.0011.0000.424
year_arrival_date0.1530.0390.0260.0160.2130.0440.1040.0520.0270.2810.0910.1590.4290.0150.0090.0440.0520.0250.0290.0140.0100.0180.0820.0910.0430.4241.000
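
The Alerts section of this report flags type and id_travel_agency_booking as highly correlated, and their coefficient in the matrix above is 0.817. As a minimal sketch of how such pairs could be picked out of a correlation matrix with pandas, the helper below lists every pair whose absolute coefficient meets a cutoff. The function name `high_correlation_pairs` and the 0.5 threshold are assumptions made for illustration, not settings confirmed by this report; the excerpt reuses four rows and columns from the matrix.

```python
import pandas as pd

# Four-variable excerpt of the correlation matrix shown in this report.
names = ["avg_price", "id_person_booking", "id_travel_agency_booking", "type"]
excerpt = pd.DataFrame(
    [
        [1.000, 0.052, -0.049, 0.471],
        [0.052, 1.000, 0.226, 0.498],
        [-0.049, 0.226, 1.000, 0.817],
        [0.471, 0.498, 0.817, 1.000],
    ],
    index=names,
    columns=names,
)


def high_correlation_pairs(corr, threshold=0.5):
    """List (a, b, value) for each unordered pair with |value| >= threshold.

    The 0.5 default is an assumed cutoff chosen for this example.
    """
    pairs = []
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        # Only look at columns after `a` so each pair is visited once
        # and the diagonal (always 1.0) is skipped.
        for b in cols[i + 1:]:
            value = float(corr.loc[a, b])
            if abs(value) >= threshold:
                pairs.append((a, b, value))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


flagged = high_correlation_pairs(excerpt)
print(flagged)
```

With the assumed 0.5 cutoff only the type / id_travel_agency_booking pair is returned; lowering the cutoff to 0.45 would also list the pairs of type with id_person_booking (0.498) and avg_price (0.471).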

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
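
As a rough sketch of what these three views summarise, the per-column missing counts and the nullity correlation can be computed directly with pandas. The helper name `nullity_summary` and the toy frame are illustrative assumptions (the toy values are made up, and only the column names come from this report); this is not necessarily how the report itself produces its plots.

```python
import numpy as np
import pandas as pd


def nullity_summary(df):
    """Return per-column missing counts and the nullity correlation matrix."""
    missing_counts = df.isna().sum()
    # A column that is never (or always) missing has a constant nullity
    # indicator, so its correlation is undefined; leave it out.
    varying = [c for c in df.columns if 0 < missing_counts[c] < len(df)]
    # Correlate the 0/1 "is missing" indicators of the remaining columns.
    nullity_corr = df[varying].isna().astype(int).corr()
    return missing_counts, nullity_corr


# Toy data: values are invented for illustration only.
toy = pd.DataFrame(
    {
        "id_travel_agency_booking": [304.0, np.nan, 240.0, np.nan, 9.0, np.nan],
        "id_person_booking": [np.nan, 67.0, np.nan, 40.0, np.nan, np.nan],
        "avg_price": [75.0, 98.0, 0.0, 107.0, 82.0, 105.5],
    }
)

counts, corr = nullity_summary(toy)
print(counts)
print(corr)
```

A negative entry in the resulting matrix means that one column tends to be filled in when the other is missing; a positive entry means the two tend to be missing together.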

Sample

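First rows

| | cancellation | type | days_between_booking_arrival | year_arrival_date | month_arrival_date | week_number_arrival_date | day_of_month_arrival_date | num_weekend_nights | num_workweek_nights | num_adults | num_children | num_babies | breakfast | country | market_segment | distribution_channel | repeated_guest | num_previous_cancellations | num_previous_stays | reserved_room | changes_between_booking_arrival | deposit_policy | id_travel_agency_booking | id_person_booking | customer_type | avg_price | required_car_parking_spaces | total_of_special_requests |
| 0 | 0 | Fancy Hotel | 342 | 2015 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | 0 | True | PRT | 0 | 0 | 0 | 0 | 0 | C | 3 | No Deposit | NaN | NaN | 0 | 0.0 | 0 | 0 |
| 1 | 0 | Fancy Hotel | 737 | 2015 | July | 27 | 1 | 0 | 0 | 2 | 0.0 | 0 | True | PRT | 0 | 0 | 0 | 0 | 0 | C | 4 | No Deposit | NaN | NaN | 0 | 0.0 | 0 | 0 |
| 2 | 0 | Fancy Hotel | 7 | 2015 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | 0 | True | GBR | 0 | 0 | 0 | 0 | 0 | A | 0 | No Deposit | NaN | NaN | 0 | 75.0 | 0 | 0 |
| 3 | 0 | Fancy Hotel | 13 | 2015 | July | 27 | 1 | 0 | 1 | 1 | 0.0 | 0 | True | GBR | 1 | 1 | 0 | 0 | 0 | A | 0 | No Deposit | 304.0 | NaN | 0 | 75.0 | 0 | 0 |
| 4 | 0 | Fancy Hotel | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | True | GBR | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 240.0 | NaN | 0 | 98.0 | 0 | 1 |
| 5 | 0 | Fancy Hotel | 14 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | True | GBR | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 240.0 | NaN | 0 | 98.0 | 0 | 1 |
| 6 | 0 | Fancy Hotel | 0 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | True | PRT | 0 | 0 | 0 | 0 | 0 | C | 0 | No Deposit | NaN | NaN | 0 | 107.0 | 0 | 0 |
| 7 | 0 | Fancy Hotel | 9 | 2015 | July | 27 | 1 | 0 | 2 | 2 | 0.0 | 0 | False | PRT | 0 | 0 | 0 | 0 | 0 | C | 0 | No Deposit | 303.0 | NaN | 0 | 103.0 | 0 | 1 |
| 8 | 1 | Fancy Hotel | 85 | 2015 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | 0 | True | PRT | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 240.0 | NaN | 0 | 82.0 | 0 | 1 |
| 9 | 1 | Fancy Hotel | 75 | 2015 | July | 27 | 1 | 0 | 3 | 2 | 0.0 | 0 | False | PRT | 3 | 2 | 0 | 0 | 0 | D | 0 | No Deposit | 15.0 | NaN | 0 | 105.5 | 0 | 0 |

Last rows

| | cancellation | type | days_between_booking_arrival | year_arrival_date | month_arrival_date | week_number_arrival_date | day_of_month_arrival_date | num_weekend_nights | num_workweek_nights | num_adults | num_children | num_babies | breakfast | country | market_segment | distribution_channel | repeated_guest | num_previous_cancellations | num_previous_stays | reserved_room | changes_between_booking_arrival | deposit_policy | id_travel_agency_booking | id_person_booking | customer_type | avg_price | required_car_parking_spaces | total_of_special_requests |
| 119380 | 0 | Hotel | 44 | 2017 | August | 35 | 31 | 1 | 3 | 2 | 0.0 | 0 | False | DEU | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 9.0 | NaN | 0 | 140.75 | 0 | 1 |
| 119381 | 0 | Hotel | 188 | 2017 | August | 35 | 31 | 2 | 3 | 2 | 0.0 | 0 | True | DEU | 0 | 0 | 0 | 0 | 0 | A | 0 | No Deposit | 14.0 | NaN | 0 | 99.00 | 0 | 0 |
| 119382 | 0 | Hotel | 135 | 2017 | August | 35 | 30 | 2 | 4 | 3 | 0.0 | 0 | True | JPN | 2 | 2 | 0 | 0 | 0 | G | 0 | No Deposit | 7.0 | NaN | 0 | 209.00 | 0 | 0 |
| 119383 | 0 | Hotel | 164 | 2017 | August | 35 | 31 | 2 | 4 | 2 | 0.0 | 0 | True | DEU | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 42.0 | NaN | 0 | 87.60 | 0 | 0 |
| 119384 | 0 | Hotel | 21 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | True | BEL | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 394.0 | NaN | 0 | 96.14 | 0 | 2 |
| 119385 | 0 | Hotel | 23 | 2017 | August | 35 | 30 | 2 | 5 | 2 | 0.0 | 0 | True | BEL | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 394.0 | NaN | 0 | 96.14 | 0 | 0 |
| 119386 | 0 | Hotel | 102 | 2017 | August | 35 | 31 | 2 | 5 | 3 | 0.0 | 0 | True | FRA | 2 | 2 | 0 | 0 | 0 | E | 0 | No Deposit | 9.0 | NaN | 0 | 225.43 | 0 | 2 |
| 119387 | 0 | Hotel | 34 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | True | DEU | 2 | 2 | 0 | 0 | 0 | D | 0 | No Deposit | 9.0 | NaN | 0 | 157.71 | 0 | 4 |
| 119388 | 0 | Hotel | 109 | 2017 | August | 35 | 31 | 2 | 5 | 2 | 0.0 | 0 | True | GBR | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 89.0 | NaN | 0 | 104.40 | 0 | 0 |
| 119389 | 0 | Hotel | 205 | 2017 | August | 35 | 29 | 2 | 7 | 2 | 0.0 | 0 | False | DEU | 2 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 9.0 | NaN | 0 | 151.20 | 0 | 2 |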

Duplicate rows

Most frequently occurring

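| | cancellation | type | days_between_booking_arrival | year_arrival_date | month_arrival_date | week_number_arrival_date | day_of_month_arrival_date | num_weekend_nights | num_workweek_nights | num_adults | num_children | num_babies | breakfast | country | market_segment | distribution_channel | repeated_guest | num_previous_cancellations | num_previous_stays | reserved_room | changes_between_booking_arrival | deposit_policy | id_travel_agency_booking | id_person_booking | customer_type | avg_price | required_car_parking_spaces | total_of_special_requests | # duplicates |
| 7805 | 1 | Hotel | 277 | 2016 | November | 46 | 7 | 1 | 2 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | NaN | NaN | 0 | 100.0 | 0 | 0 | 180 |
| 6587 | 1 | Hotel | 68 | 2016 | February | 8 | 17 | 0 | 2 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 1 | 0 | A | 0 | Non Refund | 37.0 | NaN | 0 | 75.0 | 0 | 0 | 150 |
| 6244 | 1 | Hotel | 34 | 2015 | December | 50 | 8 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 1 | 0 | A | 0 | Non Refund | 19.0 | NaN | 0 | 90.0 | 0 | 0 | 140 |
| 7476 | 1 | Hotel | 188 | 2016 | June | 25 | 15 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 119.0 | NaN | 0 | 130.0 | 0 | 0 | 109 |
| 7285 | 1 | Hotel | 158 | 2016 | May | 22 | 24 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 37.0 | NaN | 0 | 130.0 | 0 | 0 | 101 |
| 6187 | 1 | Hotel | 28 | 2017 | March | 9 | 2 | 0 | 3 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | NaN | NaN | 0 | 95.0 | 0 | 0 | 99 |
| 6307 | 1 | Hotel | 38 | 2017 | January | 2 | 14 | 0 | 1 | 1 | 0.0 | 0 | True | PRT | 1 | 1 | 0 | 0 | 0 | A | 0 | Non Refund | NaN | 67.0 | 0 | 75.0 | 0 | 0 | 99 |
| 7278 | 1 | Hotel | 156 | 2017 | April | 17 | 26 | 0 | 3 | 2 | 0.0 | 0 | True | PRT | 5 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 37.0 | NaN | 0 | 100.0 | 0 | 0 | 99 |
| 6612 | 1 | Hotel | 71 | 2016 | June | 25 | 14 | 0 | 3 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 0 | 0 | A | 0 | Non Refund | 236.0 | NaN | 0 | 120.0 | 0 | 0 | 89 |
| 4212 | 0 | Hotel | 164 | 2015 | October | 40 | 2 | 0 | 2 | 1 | 0.0 | 0 | True | PRT | 3 | 2 | 0 | 0 | 0 | A | 0 | No Deposit | 19.0 | NaN | 2 | 100.0 | 0 | 0 | 87 |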
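
A table like the one above can be approximated with pandas by grouping fully identical rows and counting each group. The sketch below is one possible way to do it, not necessarily the procedure used to build this report; the helper name `most_frequent_duplicates` and the toy values are assumptions for illustration. `dropna=False` is passed so that rows sharing NaN in columns such as id_travel_agency_booking are still grouped together.

```python
import numpy as np
import pandas as pd


def most_frequent_duplicates(df, top=10):
    """Count how often each fully duplicated row occurs, most frequent first."""
    # keep=False marks every member of a duplicated group, not only the repeats.
    dupes = df[df.duplicated(keep=False)]
    counts = (
        dupes.groupby(list(df.columns), dropna=False)
        .size()
        .reset_index(name="# duplicates")
        .sort_values("# duplicates", ascending=False)
    )
    return counts.head(top)


# Toy data: values are invented; only the column names come from this report.
toy = pd.DataFrame(
    {
        "type": ["Hotel", "Hotel", "Hotel", "Fancy Hotel", "Hotel"],
        "deposit_policy": ["Non Refund", "Non Refund", "Non Refund",
                           "No Deposit", "No Deposit"],
        "id_travel_agency_booking": [np.nan, np.nan, np.nan, 240.0, 19.0],
        "avg_price": [100.0, 100.0, 100.0, 98.0, 100.0],
    }
)

result = most_frequent_duplicates(toy)
print(result)
# Number of rows that repeat an earlier row.
print(toy.duplicated().sum())
```

Here the three identical "Hotel / Non Refund" rows collapse into a single group with a count of 3, while `toy.duplicated().sum()` counts only the two rows that repeat an earlier one.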